Computable and Faithful Lower Bound for Entanglement Cost
Quantum entanglement is a crucial resource in quantum information processing.
However, quantifying the entanglement required to prepare quantum states and
implement quantum processes remains challenging. This paper proposes computable
and faithful lower bounds for the entanglement cost of general quantum states
and quantum channels. We introduce the concept of logarithmic k-negativity, a
generalization of logarithmic negativity, to establish a general lower bound
for the entanglement cost of quantum states under quantum operations that
completely preserve the positivity of partial transpose (PPT). This bound is
efficiently computable via semidefinite programming and is non-zero for any
entangled state that is not PPT, making it faithful in the entanglement theory
with non-positive partial transpose. Furthermore, we delve into specific and
general examples to demonstrate the advantages of our proposed bounds compared
with previously known computable ones. Notably, we affirm the irreversibility
of asymptotic entanglement manipulation under PPT operations for full-rank
entangled states and the irreversibility of channel manipulation for amplitude
damping channels. We also establish the best-known lower bound for the
entanglement cost of arbitrary dimensional isotropic states. These findings
push the boundaries of understanding the structure of entanglement and the
fundamental limits of entanglement manipulation.
Comment: 25 pages
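For intuition, the standard logarithmic negativity that the paper generalizes is already computable from the partial transpose, E_N(rho) = log2 ||rho^{T_B}||_1. Below is a minimal numpy sketch of that baseline quantity on a two-qubit isotropic state; it illustrates the quantity being generalized, not the paper's k-negativity bound or its SDP.

```python
import numpy as np

def partial_transpose(rho, dA, dB):
    """Partial transpose over subsystem B of a (dA*dB) x (dA*dB) density matrix."""
    r = rho.reshape(dA, dB, dA, dB)          # indices (a, b, a', b')
    return r.transpose(0, 3, 2, 1).reshape(dA * dB, dA * dB)

def log_negativity(rho, dA, dB):
    """E_N(rho) = log2 ||rho^{T_B}||_1; the trace norm is the sum of |eigenvalues|
    because the partial transpose of a Hermitian matrix is Hermitian."""
    pt = partial_transpose(rho, dA, dB)
    return np.log2(np.abs(np.linalg.eigvalsh(pt)).sum())

# Two-qubit isotropic state: rho = p |Phi+><Phi+| + (1 - p) I/4
phi = np.zeros(4)
phi[0] = phi[3] = 1 / np.sqrt(2)
p = 0.9
rho = p * np.outer(phi, phi) + (1 - p) * np.eye(4) / 4
print(log_negativity(rho, 2, 2))  # > 0, so the state is NPT (hence entangled)
```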
HFORD: High-Fidelity and Occlusion-Robust De-identification for Face Privacy Protection
With the popularity of smart devices and the development of computer vision
technology, concerns about face privacy protection are growing. The face
de-identification technique is a practical way to solve the identity protection
problem. Existing facial de-identification methods suffer from several
problems, including degraded realism of anonymized results in the presence of
occlusions and an inability to preserve identity-irrelevant details in
anonymized results. We present a High-Fidelity and Occlusion-Robust
De-identification (HFORD) method to deal with these issues. This approach can
disentangle identities and attributes while preserving image-specific details
such as background, facial features (e.g., wrinkles), and lighting, even in
occluded scenes. To disentangle the latent codes in the GAN inversion space, we
introduce an Identity Disentanglement Module (IDM). This module selects the
latent codes that are closely related to the identity. It further separates the
latent codes into identity-related codes and attribute-related codes, enabling
the network to preserve attributes while only modifying the identity. To ensure
the preservation of image details and enhance the network's robustness to
occlusions, we propose an Attribute Retention Module (ARM). This module
adaptively preserves identity-irrelevant details and facial occlusions and
blends them into the generated results in a modulated manner. Extensive
experiments show that our method has higher quality, better detail fidelity,
and stronger occlusion robustness than other face de-identification methods.
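The abstract does not spell out the architecture, but the core idea of the IDM, separating GAN-inversion latent codes into identity-related and attribute-related parts and modifying only the former, can be sketched as follows. The W+ dimensions and the fixed layer split are hypothetical stand-ins for what HFORD selects with a learned module.

```python
import torch

# Hypothetical sketch of the IDM idea in a StyleGAN-style W+ space
# (18 layers x 512 dims); HFORD learns which codes are identity-related,
# whereas a fixed layer range is assumed here purely for illustration.
N_LAYERS, DIM = 18, 512
id_layers = torch.zeros(N_LAYERS, dtype=torch.bool)
id_layers[3:8] = True  # assumed identity-related layers

def de_identify(w_source: torch.Tensor, w_virtual: torch.Tensor) -> torch.Tensor:
    """Swap in a virtual identity while keeping attribute-related codes
    (background, wrinkles, lighting) from the source image."""
    w_out = w_source.clone()
    w_out[:, id_layers] = w_virtual[:, id_layers]
    return w_out

w_src = torch.randn(1, N_LAYERS, DIM)   # inverted latent of the input face
w_virt = torch.randn(1, N_LAYERS, DIM)  # latent encoding an anonymous identity
w_anon = de_identify(w_src, w_virt)     # would be fed to the GAN generator
```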
Statistical Analysis of Quantum State Learning Process in Quantum Neural Networks
Quantum neural networks (QNNs) have been a promising framework in pursuing
near-term quantum advantage in various fields, where many applications can be
viewed as learning a quantum state that encodes useful data. As a quantum
analog of probability distribution learning, quantum state learning is
theoretically and practically essential in quantum machine learning. In this
paper, we develop a no-go theorem for learning an unknown quantum state with
QNNs even starting from a high-fidelity initial state. We prove that when the
loss value is lower than a critical threshold, the probability of avoiding
local minima vanishes exponentially with the qubit count, while growing only
polynomially with the circuit depth. The curvature of local minima concentrates
around the quantum Fisher information times a loss-dependent constant, which
characterizes the sensitivity of the output state with respect to the
parameters in QNNs. These results hold for any circuit structure and
initialization strategy, and apply to both fixed ansatzes and adaptive
methods. Extensive numerical simulations are performed to validate our
theoretical results. Our findings place generic limits on good initial guesses
and adaptive methods for improving the learnability and scalability of QNNs,
and deepen the understanding of prior information's role in QNNs.
Comment: 28 pages including appendix. To appear at NeurIPS 2023
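As a toy illustration of the curvature statement (not the paper's general proof), consider a single-parameter state |psi(t)> = RY(t)|0> with a fidelity loss. The numerically estimated Hessian at the global minimum equals the quantum Fisher information (QFI = 1 for this generator) times a loss-dependent factor, here 1/2 at zero loss.

```python
import numpy as np

# Toy check: |psi(t)> = RY(t)|0>, loss L(t) = 1 - |<phi|psi(t)>|^2.
# The Hessian at the minimum equals QFI/2, matching a curvature set by the
# quantum Fisher information times a loss-dependent constant.
def psi(t):
    return np.array([np.cos(t / 2), np.sin(t / 2)])

t_star = 0.7                      # target parameter
phi = psi(t_star)                 # target state

def loss(t):
    return 1 - abs(phi @ psi(t)) ** 2

eps = 1e-4
hess = (loss(t_star + eps) - 2 * loss(t_star) + loss(t_star - eps)) / eps**2
qfi = 1.0                         # QFI of RY rotations acting on |0> is 1
print(hess, qfi / 2)              # both ~0.5
```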
Diff-Privacy: Diffusion-based Face Privacy Protection
Privacy protection has become a top priority as the proliferation of AI
techniques has led to widespread collection and misuse of personal data.
Anonymization and visual identity information hiding are two important facial
privacy protection tasks that aim to remove identification characteristics from
facial images at the human perception level. However, they differ
significantly: the former aims to prevent machines from recognizing the
identity correctly, while the latter must preserve the accuracy of machine
recognition. Consequently, it is difficult to train one model to complete these two
tasks simultaneously. In this paper, we unify the task of anonymization and
visual identity information hiding and propose a novel face privacy protection
method based on diffusion models, dubbed Diff-Privacy. Specifically, we train
our proposed multi-scale image inversion module (MSI) to obtain a set of
SDM-format conditional embeddings of the original image. Based on the conditional
embeddings, we design corresponding embedding scheduling strategies and
construct different energy functions during the denoising process to achieve
anonymization and visual identity information hiding. Extensive experiments
have been conducted to validate the effectiveness of our proposed framework in
protecting facial privacy.
Comment: 17 pages
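The abstract's mechanism of steering denoising with energy functions belongs to the family of guided sampling. Below is a generic, heavily simplified sketch of one energy-guided DDIM step; the function names, schedule handling, and placeholder energy are assumptions for illustration, not Diff-Privacy's exact formulation.

```python
import torch

def guided_ddim_step(model, x_t, t, cond, energy_fn, scale, alpha_t, alpha_prev):
    """One DDIM step where the noise prediction is shifted by the gradient of a
    task energy (e.g. an identity-distance energy for anonymization)."""
    x_t = x_t.detach().requires_grad_(True)
    eps = model(x_t, t, cond)                         # conditional noise prediction
    grad = torch.autograd.grad(energy_fn(x_t).sum(), x_t)[0]
    eps = eps + scale * (1 - alpha_t).sqrt() * grad   # steer toward low energy
    x0 = (x_t - (1 - alpha_t).sqrt() * eps) / alpha_t.sqrt()
    return (alpha_prev.sqrt() * x0 + (1 - alpha_prev).sqrt() * eps).detach()

# Dummy stand-ins just to exercise the step
model = lambda x, t, c: torch.zeros_like(x)
energy = lambda x: (x ** 2).flatten(1).mean(dim=1)    # placeholder energy
x = torch.randn(1, 3, 64, 64)
a_t, a_prev = torch.tensor(0.5), torch.tensor(0.6)
x_prev = guided_ddim_step(model, x, 10, None, energy, 1.0, a_t, a_prev)
```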
CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization
We propose CatVersion, an inversion-based method that learns the personalized
concept through a handful of examples. Subsequently, users can utilize text
prompts to generate images that embody the personalized concept, thereby
achieving text-to-image personalization. In contrast to existing approaches
that emphasize word embedding learning or parameter fine-tuning for the
diffusion model, which potentially causes concept dilution or overfitting, our
method concatenates embeddings on the feature-dense space of the text encoder
in the diffusion model to learn the gap between the personalized concept and
its base class, aiming to maximize the preservation of prior knowledge in
diffusion models while restoring the personalized concepts. To this end, we
first dissect the text encoder's integration in the image generation process to
identify the feature-dense space of the encoder. Afterward, we concatenate
embeddings on the Keys and Values in this space to learn the gap between the
personalized concept and its base class. In this way, the concatenated
embeddings ultimately manifest as a residual on the original attention output.
To quantify the results of personalized image generation more accurately and
without bias, we improve the CLIP image alignment score using masks.
Qualitatively and quantitatively, CatVersion helps to restore personalization
concepts more faithfully and enables more robust editing.
Comment: For the project page, please visit
https://royzhao926.github.io/CatVersion-page
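The central mechanism, learnable embeddings concatenated to the Keys and Values of attention so they act as a residual on the attention output, can be sketched as below; the dimensions, zero initialization, and single-head layout are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

class ConcatKVAttention(torch.nn.Module):
    """Attention where a few learned rows are concatenated to K and V; only
    these rows are trained, so they manifest as a residual on the output."""
    def __init__(self, dim=768, n_new=4):
        super().__init__()
        self.k_new = torch.nn.Parameter(torch.zeros(n_new, dim))
        self.v_new = torch.nn.Parameter(torch.zeros(n_new, dim))

    def forward(self, q, k, v):
        b = k.size(0)
        k = torch.cat([k, self.k_new.expand(b, -1, -1)], dim=1)
        v = torch.cat([v, self.v_new.expand(b, -1, -1)], dim=1)
        attn = F.softmax(q @ k.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        return attn @ v

q = k = v = torch.randn(2, 77, 768)   # e.g. text-encoder token features
out = ConcatKVAttention()(q, k, v)    # only k_new / v_new receive gradients
```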
All-to-key Attention for Arbitrary Style Transfer
Attention-based arbitrary style transfer studies have shown promising
performance in synthesizing vivid local style details. They typically use the
all-to-all attention mechanism -- each position of content features is fully
matched to all positions of style features. However, all-to-all attention tends
to generate distorted style patterns and has quadratic complexity, limiting the
effectiveness and efficiency of arbitrary style transfer. In this paper, we
propose a novel all-to-key attention mechanism -- each position of content
features is matched to stable key positions of style features -- that is more
in line with the characteristics of style transfer. Specifically, it integrates
two newly proposed attention forms: distributed and progressive attention.
Distributed attention assigns attention to key style representations that
depict the style distribution of local regions; progressive attention attends
from coarse-grained regions to fine-grained key positions. The
resultant module, dubbed StyA2K, shows extraordinary performance in preserving
the semantic structure and rendering consistent style patterns. Qualitative and
quantitative comparisons with state-of-the-art methods demonstrate the superior
performance of our approach.
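To make the complexity argument concrete, here is a minimal sketch of the all-to-key idea: queries from every content position attend to a small, fixed set of key style positions rather than to all style positions. Average pooling stands in for the paper's distributed/progressive selection of key positions, so this is an illustration of the cost reduction, not StyA2K itself.

```python
import torch
import torch.nn.functional as F

def all_to_key_attention(content, style, key_grid=8):
    """Content queries attend to key_grid**2 pooled style positions instead of
    all H*W positions, avoiding the quadratic all-to-all cost."""
    B, C, H, W = content.shape
    keys = F.adaptive_avg_pool2d(style, key_grid)           # B x C x 8 x 8
    q = content.flatten(2).transpose(1, 2)                  # B x HW x C
    k = v = keys.flatten(2).transpose(1, 2)                 # B x 64 x C
    attn = F.softmax(q @ k.transpose(1, 2) / C ** 0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(B, C, H, W)

content = torch.randn(1, 64, 32, 32)
style = torch.randn(1, 64, 32, 32)
stylized = all_to_key_attention(content, style)  # cost is HW x 64, not HW x HW
```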
Enhancement of non-Stabilizerness within Indefinite Causal Order
In the field of quantum computation, the non-stabilizerness of a quantum
circuit is crucial for understanding and quantifying quantum speed-up. In this
work, we explore some intriguing phenomena regarding the non-stabilizerness of
a circuit when a Quantum SWITCH structure is employed. This structure is a
novel quantum construct that enables quantum states to pass through operations
in a superposition of different orders and has shown superiority in numerous
tasks over circuits with a definite causal order. First, we discover that
completely stabilizer-preserving operations, which cannot generate magic states
under standard conditions, can be transformed into resourceful operations
capable of generating magic states when processed by the Quantum SWITCH.
Secondly, when considering the effects of noisy channels on operations, we
observe that while the non-stabilizerness of each path may be annihilated,
their superposition could still preserve the non-stabilizerness of the
operation. These findings reveal unique properties brought by the Quantum
SWITCH and open further avenues in future research on magic resources of
general quantum architectures.
Comment: 5+4 pages, 4 figures
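For reference, the Quantum SWITCH of two channels with Kraus operators {A_i} and {B_j} has Kraus operators W_ij = A_i B_j ⊗ |0><0|_c + B_j A_i ⊗ |1><1|_c, with a control qubit prepared in |+> superposing the two orders. A minimal numpy sketch follows; the depolarizing-channel example is a standard illustration of the SWITCH, not taken from the paper.

```python
import numpy as np

def quantum_switch(rho, kraus_a, kraus_b):
    """Output of the Quantum SWITCH on input rho with control in |+>."""
    d = rho.shape[0]
    P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    plus = np.full((2, 2), 0.5)                      # |+><+| control state
    rho_in = np.kron(rho, plus)
    out = np.zeros((2 * d, 2 * d), dtype=complex)
    for A in kraus_a:
        for B in kraus_b:
            W = np.kron(A @ B, P0) + np.kron(B @ A, P1)
            out += W @ rho_in @ W.conj().T
    return out                                       # joint target-control state

# Example: two fully depolarizing qubit channels (Kraus operators Pauli/2)
I2 = np.eye(2); X = np.array([[0, 1], [1, 0]])
Y = np.array([[0, -1j], [1j, 0]]); Z = np.diag([1.0, -1.0])
paulis = [P / 2 for P in (I2, X, Y, Z)]
rho = np.diag([1.0, 0.0])                            # input |0><0|
print(np.round(quantum_switch(rho, paulis, paulis), 3))
```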
VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
We present VideoReTalking, a new system to edit the faces of a real-world
talking head video according to input audio, producing a high-quality and
lip-syncing output video even with a different emotion. Our system disentangles
this objective into three sequential tasks: (1) face video generation with a
canonical expression; (2) audio-driven lip-sync; and (3) face enhancement for
improving photo-realism. Given a talking-head video, we first modify the
expression of each frame according to the same expression template using the
expression editing network, resulting in a video with the canonical expression.
This video, together with the given audio, is then fed into the lip-sync
network to generate a lip-syncing video. Finally, we improve the photo-realism
of the synthesized faces through an identity-aware face enhancement network and
post-processing. We use learning-based approaches for all three steps, and all
modules run in a sequential pipeline without any user intervention.
Furthermore, our system is a generic approach that does not need
to be retrained to a specific person. Evaluations on two widely-used datasets
and in-the-wild examples demonstrate the superiority of our framework over
other state-of-the-art methods in terms of lip-sync accuracy and visual
quality.
Comment: Accepted by SIGGRAPH Asia 2022 Conference Proceedings. Project page:
https://vinthony.github.io/video-retalking
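The three-stage design reduces to a straightforward sequential composition. A schematic sketch with placeholder callables is shown below; all names are hypothetical stand-ins for the expression-editing, lip-sync, and identity-aware enhancement networks.

```python
# Schematic of the three sequential stages; the callables are placeholders.
def retalk(frames, audio, expression_net, lipsync_net, enhance_net):
    canonical = [expression_net(f) for f in frames]   # (1) canonical expression
    synced = lipsync_net(canonical, audio)            # (2) audio-driven lip-sync
    return [enhance_net(f) for f in synced]           # (3) photo-realism enhancement

# Dummy stand-ins to exercise the pipeline
out = retalk(["frame0", "frame1"], "audio",
             lambda f: f + "+canonical",
             lambda fs, a: [f + "+lipsync" for f in fs],
             lambda f: f + "+enhanced")
print(out)
```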